ABSTRACT

We produce catalogues of voids for the SDSS DR7 redshift survey and for mock data from the
Millennium I simulation. The mock catalog is constructed such that it closely represents the
SDSS DR7 survey. We carry out a parallel analysis of the two catalogues and find that in both
the observation and the simulation, voids tend to be equally spherical. The total volume
occupied by the voids and their total number are slightly larger in the simulation than in the
observation. We find that large voids are less abundant in the simulation and that the total
luminosity of the galaxies contained in a void of a given radius is on average higher than
observed in the SDSS DR7 survey. We expect these discrepancies to be, in fact, even more
important than found here, since the present value of $\sigma_8$ given by WMAP7 is lower than
the value of 0.9 used in the Millennium I simulation. The reason why the simulation fails to
produce enough large and dark voids could be the failure of certain semi-analytic models of
galaxy formation to reduce the small-scale power of ΛCDM and to produce sufficient power on
large scales.

1 INTRODUCTION

Redshift surveys have been demonstrating for a few decades that
galaxies are distributed on a cosmic web of filaments, walls and
clumps. These structures, which form on a hierarchy of scales and
span a large redshift range, bound regions of low luminosity that
are mostly devoid of observable galaxies. These "void" regions
occupy more than 80% of the volume of the observable Universe.
Since the discovery of voids using Zwicky clusters (Einasto et al.
1980) and the discovery of the first giant void, or supervoid, in the
Bootes constellation (Kirshner et al. 1981), numerous works have followed
(Zeldovich et al. 1982; Davis et al. 1982; de Lapparent et al.;
da Costa et al. 1988; Geller & Huchra 1989; da Costa et al.)
and diverse algorithms for void identification have been developed
and applied to larger and more complete surveys (see e.g. Vogeley
et al.; Neyrinck 2008).

The formation and evolution of voids are well understood in the
framework of gravitational instability (Zeldovich et al. 1982;
Shandarin & Zeldovich 1989). However, when comparing void properties
between observations and simulations based on ΛCDM, certain
problems remain to be better understood. By definition, voids
are devoid of galaxies or contain a negligible number of faint
galaxies. The perplexing issue is that we do not see a large population of
low-mass galaxies populating voids (Klypin et al. 1999; Moore
et al. 1999) and, furthermore, the void galaxies that we see are
basically representative of the general population (Peebles 2001).

Observed voids seem to contain fewer galaxies, and in particular
fewer dwarf galaxies, than expected from ΛCDM
(Peebles 2001; Tully et al. 2008; Tikhonov & Klypin 2009). Some
studies have also shown that voids in observations are significantly
larger than those in simulations (Ryden & Turner 1984). Although
modifying models of galaxy formation might solve these problems,
and various remedies such as proper biasing and halo occupation
distribution have been proposed (Hoyle et al. 2005; Tinker et al.
2008), different studies suggest that the problem would still persist
(Bothun et al. 1986; Little & Weinberg 1994; Plionis & Basilakos
2002; Gottlober et al. 2003; Hoyle & Vogeley 2004; Goldberg et al.
2005; Hoeft et al. 2006).



The problem of empty and large voids could be due to ΛCDM
having too much power on small scales and hence to the problem of
overabundance of substructures (Tikhonov & Klypin 2009).
Substructures would occupy the voids, making them less empty, and
statistically they could break larger voids into smaller ones. On the
other hand, one could equally infer that ΛCDM lacks power on
large scales, perhaps because the value of $\sigma_8$ is too low.

In this work, we study this problem by analysing voids in the
SDSS DR7 data and by carrying out a parallel, comparative
analysis on a mock SDSS DR7 catalog based on the Millennium I
simulation. Our void-finder algorithm is an improved and generalized
version of the original algorithm proposed by Aikio & Maehoenen
(1998). The important feature of this algorithm is that it
does not assume a priori that voids are spherical and hence can be
used to study the shapes of the voids. We apply our void-finder
algorithm to the Sloan Digital Sky Survey SDSS DR7 and build a
catalog of voids. In parallel, we also apply our algorithm to a mock
SDSS DR7 catalog which we construct out of the Millennium I
simulation. The mock catalog is given the same magnitude cut-off
as SDSS DR7 and, in a different version, we also make a mock
catalog which has the same number density as SDSS but a different
magnitude cut-off. This allows us to compare various properties of
observed voids to those predicted by ΛCDM and the semi-analytic
model of galaxy formation.

In Section 2 we present our sample, drawn from the SDSS DR7
catalog. In Section 3 we present our mock catalog. In Section 4 we
explain our void-finder algorithm. In Section 5 we find the voids in
the simulation and observation catalogues and discuss the numbers,
sizes and shapes of the voids. In Section 6 we study the abundance
of large voids in the observations and the mock catalogues. In
Section 7 the luminosities of voids as a function of their sizes are
presented and compared between the simulation and the observation.
In Section 8 we conclude.



2 THE SDSS DR7: DEFINITION OF THE SAMPLE 

We have selected the main galaxy sample of the 7th data release
of the Sloan Digital Sky Survey (SDSS DR7) (Abazajian et al.
2009). The redshifts of the galaxies are corrected for the motion
of the Local Group and are given in the CMB rest frame. The
k-corrections for the SDSS galaxies are calculated using the KCORRECT
algorithm developed by Blanton et al. (2003a) and Blanton
& Roweis (2007). The boundaries of our selected region of SDSS
are 135 < RA < 235 and < DEC < 40, which contains 283076
galaxies. All of the objects in this selected region have redshift errors
smaller than $2.5 \times 10^{-4}$ and errors in their apparent r-band
"Petrosian" magnitudes, $m_r$, smaller than 0.1. The absolute
magnitudes of galaxies are determined in the r-band using the cosmological
parameters $H_0 = 100$ and the density parameters $\Omega_m = 0.25$ and
$\Omega_\Lambda = 0.75$. Galaxies belonging to voids are identified by using
a volume-limited sample taken from the selected region. The final
subsample contains 68702 galaxies with absolute magnitudes
$M_r < -19.9$ which lie in the comoving distance interval 75-325
$h^{-1}$ Mpc, corresponding to 0.02 < z < 0.12.

The selected region of the SDSS DR7 is shown in the left 
panel of Fig. 1. The right panel of this figure shows the plot of 
the absolute r-band magnitude versus comoving distance. The dark 
region in this plot illustrates the selected volume-limited sample 
which is used in this work. 



3 MOCK MILLENNIUM I CATALOG: DEFINITION OF 
THE SAMPLE 

The Millennium I simulation is run with $N = 2160^3$ particles in a
comoving box of length $L = 500\,h^{-1}$ Mpc with a mass resolution
of $8.6 \times 10^8\,h^{-1}\,M_\odot$. The adopted cosmology is a ΛCDM model
with $\Omega_m = 0.25$, $\Omega_b = 0.045$, $\Omega_\Lambda = 0.75$, $h = 0.73$, $n = 1$
and $\sigma_8 = 0.9$. This value of $\sigma_8$ is larger than its present value of
0.8 given by WMAP7 (Komatsu et al. 2011), hence yielding more
power on larger scales. The evolution of baryons within the dark
matter halos of the simulation is predicted by different semi-analytic models. Current
semi-analytic models try to incorporate various complex processes
such as gas cooling, reionization, star formation, supernova feedback,
metal evolution, black hole growth and AGN feedback (e.g.
Bower et al. (2006); De Lucia & Blaizot (2007); Guo et al.).
Although the semi-analytic models are designed to match the
observational data as closely as possible, they can still fail in certain
aspects; for example, the low-mass galaxies (with stellar mass
$\sim 10^9\,M_\odot$) are slightly over-predicted. To remedy
this problem, supernova feedback, a modified law for star formation,
or a different cosmological model are invoked (see e.g. Guo et
al. (2011); Bower et al. (2012); Wang et al. (2012); Menci et al.
(2012)).

In this work, we use the mock galaxy redshift catalog
Blaizot-ALLSky-PT-1$^1$, which was designed to mimic the SDSS
and has an almost identical redshift distribution and a very similar
color distribution. This mock catalog was constructed by Blaizot
et al. (2005) using the Mock Map Facility (MoMaF) code and the
semi-analytic model presented in De Lucia & Blaizot (2007).
Furthermore, in order to have a mock catalog that resembles the SDSS
DR7 galaxy survey as closely as possible, we select a region in the
simulation that lies in the same redshift range (0.02 < z < 0.12)
and has the same geometry. Our mock volume-limited sample
includes 68701 galaxies with stellar mass larger than $10^9\,M_\odot$ and
with $M_r < -20.16$, roughly representing the galaxies
with $M_r < -19.9$ in the SDSS DR7 sample, and covers
a volume of $1.2 \times 10^7$ (Mpc/h)$^3$, as in the volume-limited SDSS DR7 sample.
Consequently, the simulation sample has the same galaxy number
density as the SDSS DR7 sample.



1 http://www.g-vo.org/Millennium/Help?page=databases/mpamocks/blaizot2006_allsky

Figure 2. The right panel shows the initial voids in the observational data of SDSS DR7 and the left panel shows the final voids after the omission of small
and edge voids.



Table 1. Characteristics of our volume-limited samples.

                                          Observation      Simulation
Sample volume (Mpc/h)^3                   ≈ 1.2 × 10^7     ≈ 1.2 × 10^7
Number of galaxies                        68702            68701
Number of field galaxies                  5873             5377
Number of wall galaxies                   62829            63324
Number of void galaxies (field + faint)   26859            43666
Mean galaxy separation (Mpc/h)            6.22             6.35



4 THE VOID-FINDER ALGORITHM 

Various definitions of voids have been previously suggested (Kirshner
et al. 1981; Kauffmann & Fairall 1991; Sahni et al. 1994;
Benson et al. 2003) and a number of void-finding algorithms,
several of which presume voids to be nearly spherical, have been
developed (see e.g. Hoyle & Vogeley (2002)). We have tried to
avoid this limitation and have developed a method based on the
original algorithm of Aikio & Maehoenen (1998) (hereafter the AM
algorithm). The AM algorithm was originally written in 2D;
we have extended it to 3D and adapted it for application to large
datasets. The algorithm does not constrain the voids to be of any
particular shape and hence can be used to study the shapes of the
voids and their deviations from sphericity.

Prior to the application of the AM algorithm to our volume-limited
galaxy sample, we classify galaxies as wall or field galaxies.
To distinguish between wall and field galaxies, we introduce
the parameter $d$, which is related to the mean distance to the third
nearest neighbour, $\bar{d}_3$, and the standard deviation of this distance, $\sigma$,
by the expression $d = \bar{d}_3 + 1.5\sigma$ (Hoyle & Vogeley
2002). In our volume-limited galaxy sample, all galaxies with third
nearest neighbour distance, $d_3$, greater than this selection parameter,
$d$, are taken to be field galaxies and removed from the galaxy
sample, and the remaining objects are identified as wall galaxies.
We remark that a field galaxy may lie within a void region, and hence
be a void galaxy, whereas wall galaxies all lie in the cosmic filaments
and clusters and by definition are not to be found in voids.
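
The wall/field split described above can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the neighbour search is brute force (adequate only for small toy samples; a tree-based search would be needed for the full catalogue), and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
import math
import statistics

def third_neighbour_distances(points):
    """Distance from each point to its third nearest neighbour (brute force)."""
    d3 = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        d3.append(dists[2])  # index 2 is the third nearest neighbour
    return d3

def split_wall_field(points, k_sigma=1.5):
    """Split points into wall and field galaxies.

    The threshold is d = mean(d3) + k_sigma * std(d3), with k_sigma = 1.5
    as in the text. ASSUMPTION: std is the population standard deviation.
    """
    d3 = third_neighbour_distances(points)
    d = statistics.fmean(d3) + k_sigma * statistics.pstdev(d3)
    wall = [p for p, x in zip(points, d3) if x <= d]
    field = [p for p, x in zip(points, d3) if x > d]
    return wall, field, d
```

In this sketch a galaxy whose third nearest neighbour lies exactly at the threshold is kept as a wall galaxy, which matches the "greater than" wording of the criterion.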

We find that the selection parameter, $d$, is 5.96 Mpc/h for the observation
and 6.16 Mpc/h for the simulation, which results
in 9% of the galaxies in the observation and 8% in the simulation
being identified as field galaxies. The details of the samples are
given in Table 1.

To implement the AM algorithm, wall galaxies are gridded up
in cells of size 1 Mpc/h. The AM algorithm starts on the Cartesian
gridded wall galaxy sample by defining a Distance Field (DF). For
a given grid point in the three-dimensional galaxy sample, the DF is defined
as the distance to the nearest particle. Then, according to the values
of the DF at the closest neighbours of each grid point, the local maxima
of the DF are found; each local maximum defines a subvoid. In order to assign each element of the
grid to a subvoid, the "Climbing Algorithm" (Schmidt et al.
2001) is used: for a unit cell bounded by the grid points, i.e.
an elementary cell, the gradient of the DF towards each of the neighbouring
cells is calculated and the route of steepest ascent is followed up to a local
maximum. The elementary cell and every
other cell along the climbing route are then assigned to the subvoid of that maximum.
Finally, if the distance between two subvoids is less than the DF values of both,
they are joined together into a larger void.
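
The three steps above (distance field, climbing, merging) can be sketched as follows. This is a toy illustration under stated assumptions, not the implementation used in the paper: the DF is computed by brute force, the climb uses all adjacent grid nodes (including diagonals) as neighbours, the distance between two subvoids is taken to be the distance between their DF maxima, and all function names are hypothetical.

```python
import itertools
import math

def distance_field(shape, galaxies):
    """DF at each grid node: distance to the nearest wall galaxy (brute force)."""
    nodes = itertools.product(*(range(n) for n in shape))
    return {node: min(math.dist(node, g) for g in galaxies) for node in nodes}

def climb(node, df):
    """Follow the steepest positive DF gradient until a local maximum is reached."""
    offsets = [o for o in itertools.product((-1, 0, 1), repeat=len(node)) if any(o)]
    while True:
        best, best_grad = node, 0.0
        for o in offsets:
            nb = tuple(a + b for a, b in zip(node, o))
            if nb in df:
                grad = (df[nb] - df[node]) / math.hypot(*o)
                if grad > best_grad:
                    best, best_grad = nb, grad
        if best == node:  # no neighbour with a larger DF: local maximum
            return node
        node = best

def find_voids(shape, galaxies):
    """Return {grid node: void label}, the label being one DF maximum of the void."""
    df = distance_field(shape, galaxies)
    peak_of = {node: climb(node, df) for node in df}
    peaks = sorted(set(peak_of.values()))
    parent = {p: p for p in peaks}

    def root(p):
        while parent[p] != p:
            p = parent[p]
        return p

    # ASSUMPTION: two subvoids are joined when the separation of their DF
    # maxima is smaller than the DF value at both maxima.
    for p, q in itertools.combinations(peaks, 2):
        if math.dist(p, q) < min(df[p], df[q]):
            parent[root(q)] = root(p)
    return {node: root(peak) for node, peak in peak_of.items()}
```

The sketch works in any dimension: `find_voids((9, 9), [(0, 0), (0, 8), (8, 0), (8, 8)])` labels a 2D toy grid, and a three-component `shape` gives the 3D case. Each climbing step strictly increases the DF, so the climb always terminates on a finite grid.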

The void volume is estimated using the number of grid points
inside a given void multiplied by the volume associated with the
grid cell. For each void, we define its effective radius ($r_\mathrm{eff}$) as the
radius of a sphere whose volume is equal to that of the void.
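
Written out, the definition above reads as follows; the symbols $N_\mathrm{cell}$ (number of grid points in the void) and $\Delta$ (cell size, 1 Mpc/h here) are notation introduced only for this illustration.

```latex
V = N_{\mathrm{cell}}\,\Delta^{3},
\qquad
r_{\mathrm{eff}} = \left(\frac{3V}{4\pi}\right)^{1/3}.
% Worked example: r_eff = 7 Mpc/h corresponds to
% V = (4\pi/3)\,7^{3} \approx 1437\ (\mathrm{Mpc}/h)^{3},
% i.e. about 1437 grid cells of 1 (Mpc/h)^3.
```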

The configuration of each void in this algorithm depends on
the grid points, and we therefore determine the void centre as
the centre of mass of the grid points
that enclose its elementary cells. Following this standard method
and giving the same weight to all elementary cells, the centre of
each void can be written as

$$x_j^V = \frac{1}{N} \sum_{i=1}^{N} x_j^i \qquad (1)$$

where $x_j^i$ ($j = 1, 2, 3$) are the coordinates of the $i$-th elementary cell and $N$
is the number of cells in the void $V$. The shape of a void is then
characterized by the ratio of the number of its grid points which
lie between its centre and its effective radius to its volume. This
ratio is an indicator of the deviation of the shape of the void from
sphericity. Ideally, for a spherical void this ratio is equal to one.
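
A minimal sketch of the centre and shape measures follows. It assumes that cell coordinates are given in Mpc/h, that the volume is the cell count times the cell volume, and that the ratio counts only grid points belonging to the void itself; the function names are hypothetical.

```python
import math

def void_centre(cells):
    """Equal-weight centre of mass of the void's elementary cells."""
    n = len(cells)
    return tuple(sum(c[j] for c in cells) / n for j in range(3))

def effective_radius(cells, cell_volume=1.0):
    """Radius of the sphere whose volume equals the void volume."""
    volume = len(cells) * cell_volume
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)

def sphericity(cells, cell_volume=1.0):
    """Fraction of the void's cells lying within r_eff of its centre."""
    centre = void_centre(cells)
    r_eff = effective_radius(cells, cell_volume)
    inside = sum(1 for c in cells if math.dist(c, centre) <= r_eff)
    return inside / len(cells)
```

On a discrete grid even a ball-shaped set of cells gives a value slightly below one, because cells at the outermost radius fall just outside $r_\mathrm{eff}$; a thin rod of cells gives a value close to zero.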

In the next section, we apply this algorithm to the SDSS DR7 
and the mock catalog to construct catalogues of voids and to study 
their characteristics.

Table 2. Statistics of voids in the SDSS DR7 observation and in the simulated mock catalogue.

                                  Observation                   Simulation
                                  Number   Volume (Mpc/h)^3     Number   Volume (Mpc/h)^3
All voids                         4616     12541454             4847     12555147
Edge voids                        1148     7844214 (62.5%)      1193     7646672 (61%)
Small voids (r_eff < 7 Mpc/h)     3001     722062 (5.8%)        3085     845753 (6.7%)
Voids in the final sample         467      3975178 (31.7%)      569      4062722 (32.3%)

Table 3. The sizes and sphericities of voids in the observation and the simulated mock catalogue.

              Effective radius (Mpc/h)   Max-length (Mpc/h)       Surface (Mpc/h)^2        Sphericity
              Max     Min    Median      Max     Min    Median    Max     Min    Median    Max    Min    Median
Observation   30.47   7.02   9.65        108.6   19.9   32.3      35414   1214   2588      0.82   0.22   0.71
Simulation    28.15   7.00   9.08        103.1   19.1   30.1      33018   1210   2276      0.84   0.12   0.72


5 VOIDS IN THE SDSS DR7 REDSHIFT SURVEY AND IN 
THE MOCK CATALOG 

We have identified 4616 and 4847 voids of different sizes and
shapes in the SDSS DR7 survey and in the mock catalog
respectively. We have avoided problems due to boundary effects by
selecting voids that lie completely inside the geometrical boundaries of
our catalogues. Edge voids, i.e. those that touch the survey
boundaries, are therefore removed from our void catalog because of their
under-estimated volumes and distorted shapes (see Fig. 2).

The size of each void is characterized by its effective radius,
defined in the previous section. To avoid counting spurious voids,
we set a threshold of 7 Mpc/h for the minimum effective
radius of voids in both samples. This threshold is larger than the mean
distance between galaxies in the sample and helps to eliminate
spurious small voids from the sample. After removing all of the
spurious voids, we end up with 467 and 569 voids in our
volume-limited sample of the SDSS DR7 survey and in the mock simulation
data respectively, which occupy about 32% of the volumes of the
samples. The statistics of voids are given in Table 2. Hereafter, all
analyses are carried out on voids in the final sample, obtained after
the elimination of small and edge voids.
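
The selection of the final sample can be sketched as a simple filter. The dictionary keys (`edge`, `r_eff`, `volume`) and function names are hypothetical; only the 7 Mpc/h threshold and the exclusion of edge voids come from the text.

```python
def final_sample(voids, r_min=7.0):
    """Keep voids that do not touch the survey boundary and have r_eff >= r_min (Mpc/h)."""
    return [v for v in voids if not v["edge"] and v["r_eff"] >= r_min]

def volume_fraction(voids, total_volume):
    """Fraction of total_volume occupied by the given voids."""
    return sum(v["volume"] for v in voids) / total_volume

# Toy catalogue with hypothetical entries: one edge void, one small void, one kept.
toy = [
    {"edge": True, "r_eff": 20.0, "volume": 33510.0},
    {"edge": False, "r_eff": 3.0, "volume": 113.0},
    {"edge": False, "r_eff": 9.0, "volume": 3054.0},
]
kept = final_sample(toy)
```

As a consistency check on the numbers in Table 2, the observed counts satisfy 1148 + 3001 + 467 = 4616, and the final-sample volume fraction is 3975178 / 12541454 ≈ 0.317.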

Table 3 compares the statistical properties of voids in the
observed and mock catalogues. It shows that the median void
sphericity is about 0.7 in both samples, which indicates that
voids tend to be mostly spherical. Fig. 3 also shows that voids tend
to become more spherical with increasing radii. There is good
agreement between the mock catalog and the SDSS observation,
although the observed voids seem to be marginally more spherical
in general. More and better data are needed to see if the marginal
difference reported here is of any significance.



6 ABUNDANCE OF LARGE VOIDS: THE SDSS DR7 
OBSERVATION VERSUS THE MOCK CATALOG 

We have compared the distribution of the sizes of voids in the
observation against the simulated mock catalogues. Fig. 4 shows that
the volume occupied by voids is larger in the simulation than in the
observation. In particular, both the histograms and the cumulative
plots show that the largest voids are absent from the simulation
whereas they are present in the observation.
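
A cumulative size distribution of the kind compared here can be computed as in the sketch below. The function name and the toy radii are hypothetical and are not taken from either catalogue.

```python
from bisect import bisect_right

def cumulative_counts(radii, thresholds):
    """N(> r): number of voids with effective radius larger than each threshold."""
    ordered = sorted(radii)
    return [len(ordered) - bisect_right(ordered, r) for r in thresholds]

# Toy effective radii in Mpc/h (illustrative values only).
toy_radii = [7.5, 9.0, 12.0, 30.0]
counts = cumulative_counts(toy_radii, [7.0, 10.0, 28.0])
```

Applying the same function to the observed and mock void radii with a common set of thresholds gives two curves that can be compared directly at large radii.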

The problem of large voids could be related to the overabundance
of small galaxies, which would subsequently divide large
voids into smaller ones. However, this could be resolved by proper
biasing in the modelling of galaxy formation and evolution. Hence, the
problem of large voids could be due to a shortcoming of the
semi-analytic model of galaxy formation used for the mock catalog that
we have analysed here. A recent study that also compared the SDSS
DR7 voids with those taken from an SPH simulation and a halo
occupation model, and hence used a different model of galaxy
evolution, seems to indicate that the distributions of the sizes of voids
agree in the two samples (Pan et al. 2012). Hence, such void
properties could be of potential importance in distinguishing between
different galaxy formation scenarios.



7 OBSERVED SDSS VOIDS ARE LESS LUMINOUS 
THAN THOSE IN THE MOCK CATALOG 

Prior to comparing the luminosities of the voids between simulation
and observation, we need to make sure that there is no bias between
the two samples. In Fig. 5 we have plotted the histogram of the
absolute magnitudes of field and faint galaxies that are found in
the voids in the two catalogues. The figure shows that although the
number of void galaxies in the mock catalog is larger than that in the
observation, the distributions are the same in both catalogues. The minimum
and maximum magnitudes are nearly the same, namely $M \sim -16.5$
and $M \sim -22$, in both the observation and the simulation. This
demonstrates that there is no bias between the two samples.

We comment that the void galaxies could be field galaxies only or
both field and faint galaxies. We recall that the field galaxies are in the
luminosity ranges $M < -19.9$ in the observation and $M < -20.16$
in the simulation, whereas faint galaxies are less luminous than these
thresholds set in our volume-limited samples (see Fig. 5).

Figure 3. [Axes: Sphericity; Effective Radius ($h^{-1}$ Mpc).] The left panel shows that the distribution of sphericity is skewed towards larger sphericities, i.e. voids are mostly spherical. The right panel shows a plot of
the sphericities versus the equivalent radii of the voids, demonstrating that voids become more spherical with increasing radii. There is no significant difference
between the observation and the simulation, and more data would be needed to establish any disagreement between the two.

Figure 4. [Axes: Effective Radius ($h^{-1}$ Mpc).] Top panel: Distribution of the sizes of voids in the observation and the simulation: larger voids are more abundant in the observation. Bottom
panel: Cumulative plots of the number of voids against their equivalent radii show again that larger voids are more abundant in the observation. The bottom plots
show the volume/radius cumulative curves, where both the cumulative volume and the normalized volume are plotted against the effective radii of the voids. The
histograms show that at large radii, there are more voids in the observation than in the simulation. The lower panel demonstrates that the number and volume
of voids are, in general, higher in the simulation than in the observation (see Table 2).

Figure 5. [Axis: Absolute Magnitude of Void Galaxies.] Number of void galaxies plotted against their absolute magnitudes. The range of luminosities of void galaxies is nearly the same for both
simulation and observation, hence demonstrating that there is no bias imposed on the calculation of the luminosities of voids. The figure shows that voids in
the simulation contain more galaxies in almost all magnitude bands and hence are more luminous than those in the observation.

We have compared both the total luminosity of the voids and
their luminosity per unit volume between the observation and the
simulation. The comparisons are shown in Fig. 6. The lower panel
of Fig. 6 shows that if we consider faint and field galaxies, large
voids are clearly more luminous in the mock catalog than in the
observation. However, the top panel of Fig. 6 shows that if we
consider only field galaxies, this discrepancy becomes less prominent.
We emphasize that the minimum magnitude cut-off for both
samples is nearly the same when faint galaxies are considered (see
Fig. 5). This discrepancy could be a sign of the overabundance of
small faint galaxies in the simulation. The problem of empty voids
could be related to the lack of large-scale power in ΛCDM, in spite of
the fact that the value of $\sigma_8$ used here is 0.9, which is larger than its
present value of 0.8 given by WMAP7. Hence, this discrepancy is
expected to be more significant than found here when the WMAP7
value of $\sigma_8$ is used.
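
The void luminosities compared in this section are sums over the galaxies each void contains. A minimal sketch of that sum follows; the solar r-band absolute magnitude `M_SUN_R` is an assumed value that does not appear in the text and should be replaced by the one appropriate to the band definition in use, and the function names are hypothetical.

```python
import math

# ASSUMPTION: r-band absolute magnitude of the Sun; not given in the text.
M_SUN_R = 4.64

def luminosity(abs_mag, m_sun=M_SUN_R):
    """Luminosity in solar units from an absolute magnitude."""
    return 10.0 ** (-0.4 * (abs_mag - m_sun))

def void_luminosity(abs_mags, volume):
    """Total luminosity of a void and its luminosity per unit volume.

    abs_mags: absolute magnitudes of the galaxies inside the void.
    volume:   void volume, e.g. in (Mpc/h)^3.
    """
    total = sum(luminosity(m) for m in abs_mags)
    return total, total / volume
```

Because a difference of 5 magnitudes corresponds to a factor of 100 in luminosity, the total is dominated by the brightest void galaxies, although faint galaxies can still contribute noticeably when they are numerous.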
8 CONCLUSION

In this paper, we have carried out a parallel study of the voids in the
SDSS DR7 redshift survey and in a mock catalog. The latter is
extracted from the Millennium I simulation and aims at replicating the
observational biases and limitations of the SDSS DR7 catalog.

We have found that the total number and the volume occupied
by the voids are larger in the simulation than in the observation.
We find 467 voids in SDSS DR7 and 569 in the mock catalog.
The voids' pseudo-radii or effective radii (i.e. radii of an equivalent
spherical volume) range from 7 to 31 Mpc/h. The sphericities
of voids also have similar distributions in the observation and the
simulation. The voids also tend to become more spherical with
increasing effective radii. Furthermore, large voids are less abundant
in the simulation, and the mean luminosities of voids, defined as
the sum of the luminosities of the galaxies they contain, are larger in
the simulation. The problem of the abundance of large voids could be
related to the problem of overabundance of small haloes in ΛCDM,
which would then divide large voids into smaller ones in the
simulation. However, this problem is usually taken care of in models of
galaxy formation by suitable biasing or quenching of galaxy formation
on small scales. The persistence of this problem could demonstrate
that the semi-analytic model of galaxy formation used in the
mock catalog does not suppress galaxy formation in small voids
efficiently.

We have also found that voids are in general more luminous in
the simulation than in the observation. This could be related to the
lack of power of ΛCDM on large scales. The value of $\sigma_8$ used in the
Millennium I simulation is 0.9, which is high compared to
the value of 0.8 given by WMAP7. The problem of empty voids
could then become even more significant if the current value of $\sigma_8$
were used in the simulation. Hence, either the ingredients used in the
semi-analytic model do not correctly reproduce the observations or,
on a more fundamental level, the power spectrum of ΛCDM has
too much power on small scales and too little on large scales, and
these cannot be remedied by realistic models of galaxy formation.

Figure 6. Top panel: The total luminosity and the luminosity density of field galaxies are plotted against the effective radii of the voids to which they belong.
Larger voids are less luminous in the observation than in the simulation. This disagreement becomes more significant when faint objects are also taken into
account, as shown in the two plots of the lower panel. Observed voids are clearly less luminous than simulated voids. Note that the luminosity cutoffs are the
same for the observation and the simulation when faint galaxies are taken into account. We expect this discrepancy to be even more significant than shown
here because our Millennium I simulation uses a larger value of $\sigma_8$ than given by WMAP7.